1. (35%) Linear Regression Analysis for Wine Quality

(a) (10%) Show the results of regression analysis as follows

(b) (5%) Is fitting a linear regression a good idea? If yes, why? If not, why not? What are the possible reasons for a poor fit?

ans: Judging by the R-squared of 0.495, the linear regression does not fit the data well: the model explains only about half of the variance in the response. A possible reason is that there are too many variables, and some of them are highly correlated with each other.
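The suspicion about highly correlated variables can be checked by computing pairwise Pearson correlations between the predictors. The following is only a minimal sketch: the helper names `pearson` and `high_corr_pairs`, the 0.8 threshold, and the toy columns `x1` to `x3` are assumptions for illustration and are not taken from the wine dataset.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def high_corr_pairs(columns, threshold=0.8):
    """Return (name_a, name_b, r) for every pair with |r| >= threshold."""
    pairs = []
    for (name_a, col_a), (name_b, col_b) in combinations(columns.items(), 2):
        r = pearson(col_a, col_b)
        if abs(r) >= threshold:
            pairs.append((name_a, name_b, r))
    return pairs


# Hypothetical toy predictors (not the real wine features).
toy = {"x1": [1, 2, 3, 4], "x2": [2, 4, 6, 8.1], "x3": [4, 1, 3, 2]}
flagged = high_corr_pairs(toy)
for name_a, name_b, r in flagged:
    print(f"{name_a} and {name_b} are highly correlated (r = {r:.3f})")
```

Pairs flagged this way are candidates for dropping or combining before the regression is refitted.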

(c) (5%) Based on the results, rank the independent variables by p-value. Which ones are statistically significant, with p-values < 0.01? (i.e., important variable selection)

From the results, we can infer that f18, f2, f14, f15, f22, f17, f25, and f6 are statistically significant variables because their p-values are less than 0.01.
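The ranking and filtering step can be sketched as follows. The function name `rank_by_pvalue` and the p-values in `example` are hypothetical placeholders, not the values from the regression output above; in practice the mapping would be filled from the fitted model's summary.

```python
def rank_by_pvalue(pvalues, alpha=0.01):
    """Sort variables by ascending p-value and keep those with p < alpha."""
    ranked = sorted(pvalues.items(), key=lambda item: item[1])
    significant = [name for name, p in ranked if p < alpha]
    return ranked, significant


# Hypothetical p-values for illustration only.
example = {"x1": 0.20, "x2": 0.0004, "x3": 0.03, "x4": 0.008}
ranked, significant = rank_by_pvalue(example)

for name, p in ranked:
    print(f"{name}: p = {p}")
print("significant at 0.01:", significant)
```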

(d) (15%) Test the underlying assumptions of regression with respect to the residuals: (1) Normality, (2) Independence, and (3) Homogeneity of Variance.
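As one illustration of such a check, the independence assumption is often examined with the Durbin-Watson statistic, which is close to 2 when the residuals show no first-order autocorrelation, near 0 under strong positive autocorrelation, and near 4 under strong negative autocorrelation. The sketch below is a plain-Python version with made-up residuals; it is not the analysis used in this report, and it covers only the independence check.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences
    divided by the sum of squared residuals."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2
        for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Made-up residuals for illustration only.
alternating = [1.0, -1.0, 1.0, -1.0]   # strong negative autocorrelation
constant = [1.0, 1.0, 1.0, 1.0]        # strong positive autocorrelation

dw_alternating = durbin_watson(alternating)
dw_constant = durbin_watson(constant)
print(dw_alternating, dw_constant)
```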

2. (35%) Association Rule: Market Basket Analysis

(1) (10%) How to handle the raw dataset via data preprocessing?

ans:
step 1: load the data from the CSV file with the pandas package.
step 2: separate the items on every line and put them in a list.

This yields a two-dimensional list in which each sub-list represents one transaction.
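The two steps above can be sketched as follows. Two things here are assumptions: the file layout (one transaction per line, items separated by commas) and the use of the standard-library `csv` module, which is swapped in for pandas so that the example runs on its own. The function name `load_transactions` and the sample rows are hypothetical.

```python
import csv
import io


def load_transactions(csv_file):
    """Read one transaction per line and return a list of item lists."""
    transactions = []
    for row in csv.reader(csv_file):
        # Drop empty cells and surrounding whitespace.
        items = [item.strip() for item in row if item.strip()]
        if items:
            transactions.append(items)
    return transactions


# Hypothetical sample standing in for the raw CSV file.
sample = io.StringIO(
    "whole milk,rolls/buns\n"
    "chocolate,rolls/buns,soda\n"
    "\n"
    "soda, \n"
)
transactions = load_transactions(sample)
print(transactions)
```

With a real file, the same function could be called on `open(path, newline="")` in place of the `StringIO` object.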

(2) (10%) What are the top 5 association rules? Show the support, confidence, and lift of each rule.

ans:
The top 5 association rules, ranked by lift, are:

(3) (5%) Please provide/guess the “story” to interpret one of top-5 rules you are interested in.

ans:
I think it is quite intuitive that people who buy chocolate also tend to buy rolls/buns. It is a common combination in Western countries, whether for breakfast or for afternoon tea.
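To make the three metrics behind such a rule concrete, here is a minimal sketch that computes support, confidence, and lift for a rule A → B directly from a list of transactions. The function name `rule_metrics` and the four toy transactions are hypothetical; the resulting numbers are not the ones from the actual dataset.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for antecedent -> consequent."""
    n = len(transactions)
    a, c = set(antecedent), set(consequent)
    count_a = sum(1 for t in transactions if a <= set(t))
    count_c = sum(1 for t in transactions if c <= set(t))
    count_ac = sum(1 for t in transactions if (a | c) <= set(t))

    support = count_ac / n                        # P(A and B)
    confidence = count_ac / count_a if count_a else 0.0   # P(B | A)
    lift = confidence / (count_c / n) if count_c else 0.0  # P(B | A) / P(B)
    return support, confidence, lift


# Hypothetical toy transactions for illustration only.
toy_transactions = [
    ["chocolate", "rolls/buns"],
    ["chocolate", "rolls/buns", "soda"],
    ["rolls/buns"],
    ["soda"],
]
support, confidence, lift = rule_metrics(
    toy_transactions, ["chocolate"], ["rolls/buns"]
)
print(support, confidence, lift)
```

A lift above 1 means the two items appear together more often than they would if they were bought independently.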

(4) (10%) Give a visualization graph of your association rules

3. (30%) Manufacturing System Analysis

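(a) (10%) Using Little's Law, compute the throughput TH of each workstation in the table below. What are the bottleneck rate (𝑟𝑏), the minimum cycle time (total processing time, 𝑇0), and the critical WIP level (𝑊0)?

(b) (10%) Give the formulas for the maximum throughput (THbest) and the minimum cycle time (CTbest) under best-case performance.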

With w denoting the WIP level, 𝑟𝑏 = 0.38, 𝑇0 = 2.4, and 𝑊0 = 𝑟𝑏 × 𝑇0 ≈ 0.91:

if w <= 0.91:
    (THbest, CTbest) = (w/2.4, 2.4)
else:
    (THbest, CTbest) = (0.38, w/0.38)
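The best-case formulas above can be written as a small runnable function. This is a sketch that takes 0.38 as the bottleneck rate and 2.4 as the minimum cycle time, as they appear in the formulas; the function name `best_case` and the parameter names are assumptions.

```python
def best_case(w, r_b=0.38, t0=2.4):
    """Best-case throughput and cycle time for WIP level w."""
    w0 = r_b * t0  # critical WIP, about 0.91 with the defaults
    if w <= w0:
        # Below critical WIP: no queueing, throughput limited by WIP.
        return w / t0, t0
    # Above critical WIP: throughput capped at the bottleneck rate.
    return r_b, w / r_b


for w in (0.5, 0.912, 2.0):
    th_best, ct_best = best_case(w)
    print(f"w = {w}: THbest = {th_best:.3f}, CTbest = {ct_best:.3f}")
```

In both branches THbest × CTbest equals w, which is consistent with Little's Law.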

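(c) (10%) For the production line in this problem, write a program to build a simulation model (or use a software package or numerical analysis) to verify that when the WIP level exceeds the factory's capacity, the cycle time deteriorates severely. That is, when the release rate of the line (input volume) is greater than the line's throughput rate, the production system is in a non-steady state. Use charts to show how the relationship among WIP, CT, and TH deteriorates.

Once WIP increases to about 0.9, TH becomes constant while CT keeps increasing.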
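The non-steady-state behaviour can be illustrated with a very simple deterministic fluid model. This is only a sketch and not the simulation model used for the charts in this report: it treats the whole line as a single server with capacity 0.38 per time unit (the bottleneck rate from part (b)), and the function name `simulate_overload`, the release rates, and the horizon are assumptions.

```python
def simulate_overload(release_rate, r_b=0.38, horizon=50, dt=1.0):
    """Release work at release_rate, complete at most r_b per time unit,
    and record (time, WIP, TH, CT) after every step."""
    wip = 0.0
    history = []
    for step in range(1, horizon + 1):
        wip += release_rate * dt        # work released into the line
        done = min(wip, r_b * dt)       # work completed this step
        wip -= done
        th = done / dt
        ct = wip / th if th > 0 else 0.0  # Little's Law: CT = WIP / TH
        history.append((step * dt, wip, th, ct))
    return history


overloaded = simulate_overload(release_rate=0.5)  # release rate > r_b
stable = simulate_overload(release_rate=0.3)      # release rate < r_b

for t, wip, th, ct in overloaded[::10]:
    print(f"t = {t:4.0f}  WIP = {wip:5.2f}  TH = {th:.2f}  CT = {ct:6.2f}")
```

When the release rate exceeds the capacity, TH stays pinned at 0.38 while WIP and CT grow without bound; when it is below the capacity, the queue in this simplified model stays empty.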